We review statistical-mechanical theories of single-molecule micromanipulation experiments on 
nucleic acids. First, models for describing polymer elasticity are introduced. We then review how 
these models are used to interpret single-molecule force-extension experiments on single-stranded and 
double-stranded DNA. Depending on the force and the molecules used, both smooth elastic behaviors 
and abrupt structural transitions are observed. Third, we show how combining the elasticity of two 
single nucleic acid strands with a description of the base-pairing interactions between them explains 
much of the phenomenology and kinetics of RNA and DNA 'unzipping' experiments. 



I. INTRODUCTION 

Single-molecule studies that provide information on the properties of one or a few interacting biomolecules are becoming
increasingly important in biophysics. The precision of control and quantitative measurement, and the simple interpretation
of these experiments, make detailed theoretical analyses appropriate. The theory of single-molecule micromanipulation
experiments is a new development of polymer physics, emphasizing the structural richness of biopolymers
(inhomogeneity of sequence, sequence-specific monomer interactions, transformations of secondary structure, etc.). Both
equilibrium and non-equilibrium aspects of single-molecule experiments reveal new basic physical problems.

This review presents some of the theoretical ideas that have been useful for the description of single-molecule micromanipulation
studies of nucleic acids. First, models useful for describing biopolymer elasticity will be presented. We will
then review how these models are used to interpret single-molecule DNA force-extension experiments, which show 
both smooth elastic behaviors and abrupt structural transitions. Third, we will show how combining the elasticity of 
two single nucleic acid strands with a description of the base-pairing interactions between them explains much of the 
phenomenology of RNA and DNA 'unzipping' experiments. 

The theoretical studies that we review use a wide range of tools and concepts from statistical mechanics and quantum 
mechanics. Single molecules are composed of a large number of elementary units (monomers). The nearest-neighbour 
character of the interactions between monomers often leads to partition functions with the form of path integrals (the 
curvilinear coordinate plays the role of time) which can be analyzed using the tools of quantum mechanics. Ideas 
from the theory of phase transitions are also extensively employed, for example to describe the abrupt, first-order-like
structural changes frequently observed in stretching experiments. The kinetics of such transitions are in turn related to
problems from non-equilibrium statistical mechanics.

II. THEORETICAL MODELS OF FLEXIBLE POLYMERS 

A number of polymer models have been used to model single-molecule experiments. Here we focus on applications 
relevant to double-stranded DNA, which is important biologically and also nearly ideal as an object for theoretical 
study. 

A. Gaussian Model 

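The simplest description of a polymer is the Gaussian polymer (GP) model, which essentially considers a
polymer to be a series of particles joined by Hookean springs (Fig. 1). Let us call $\mathbf{r}_n = (x_n, y_n, z_n)$ the location of
monomer $n$. The vector leading from monomer $n-1$ to monomer $n$, $\mathbf{r}_n - \mathbf{r}_{n-1}$, obeys a Gaussian distribution of
average zero and variance $\langle (\mathbf{r}_n - \mathbf{r}_{n-1})^2 \rangle = b^2$,

$$P(\{\mathbf{r}_n\}) \propto \exp\left[-\frac{3}{2b^2} \sum_{n=1}^{N} (\mathbf{r}_n - \mathbf{r}_{n-1})^2\right] \qquad (1)$$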




FIG. 1. Mathematical representations of polymeric chains. GP: the Gaussian polymer is made of N monomers represented 
by harmonic springs. FJC: the Freely Jointed Chain is composed of N bonds of fixed length b, with no correlation between the 
orientation of adjacent segments. WLC: the Worm Like Chain is a continuous model, characterized by a persistence length A; 
the orientation of the chain tangent $\mathbf{t}(s)$ changes appreciably over contour lengths greater than A. WLRC: the Worm Like
Rod Chain is described by a rotating three-dimensional coordinate system, with local trihedron $(\mathbf{t}, \mathbf{n}_1, \mathbf{n}_2)$, along the curvilinear
coordinate s.



where the monomer index $n$ runs from 0 to $N$.

When the chain is subjected to a force $f$, the Hamiltonian has a Gaussian elastic term and a
force-distance term,

$$H_{GP} = \frac{3 k_B T}{2 b^2} \sum_{n=1}^{N} (\mathbf{r}_n - \mathbf{r}_{n-1})^2 - f\,(z_N - z_0) \qquad (2)$$

where the force is taken to act in the z-direction. The statistics of the z-component of the end-to-end vector are easily
computed; for example the average end-to-end distance as a function of the force is

$$\langle z \rangle_{GP} = \frac{N b^2 f}{3 k_B T} \qquad (3)$$

The Gaussian polymer as a whole behaves as a Hookean spring of zero rest length and stiffness $3 k_B T/(N b^2)$,
proportional to the temperature and inversely proportional to the length $N$. This effective elasticity is a model for
the entropic elasticity resulting from the decrease in the number of a polymer's configurations as it is extended. This
basic picture of flexible polymer elasticity is the basis of rubber elasticity and a starting point for polymer physics [1].



B. Freely-Jointed Chain model 

The GP has the unphysical feature that it can be indefinitely extended, and is therefore useful only for weakly
stretched polymers, and even then only when physico-chemical details of the monomers are not of interest. A model
which corrects the indefinite extensibility problem but which is still elementary and in wide use is the Freely Jointed
Chain model [1], which consists of $N$ bonds of fixed length $b$ (the Kuhn length), allowed to point in any direction
independently of each other (see Fig. 1). When under zero force, the mean-squared end-to-end distance is $R^2 = Nb^2$,
the familiar result for a random walk of $N$ steps.

When this chain is subjected to a force $f$, the bonds tend to align along the force, as dipoles in an electric field,
with an energy

$$H_{FJC} = -f b \sum_{n=1}^{N} \cos\theta_n \qquad (4)$$

where $\theta_n$ is the angle between the force and the $n$th bond directions. Since the segment orientations are decoupled,
the partition function is easily calculated. The mean end-to-end distance when a force $f$ is applied is



$$\langle z \rangle_{FJC} = N b\, \langle \cos\theta \rangle = N b \left[\coth\left(\frac{b f}{k_B T}\right) - \frac{k_B T}{b f}\right] \qquad (5)$$

The small force behavior coincides with that of the GP expression (3). However, (5) departs from the GP at large
forces, since the FJC model properly takes into account that the extension of the molecule cannot exceed the total
contour length $Nb$. For large $f$, $\langle z \rangle_{FJC}/(Nb) \approx 1 - k_B T/(bf)$.
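The two limits quoted above can be checked numerically. The following is a minimal sketch, not taken from the original article: it assumes $k_B T \approx 4.1$ pN nm (room temperature) and uses the dsDNA-like values $b = 106$ nm and $L = 16.4$ microns purely as illustrative inputs; the function names are hypothetical.

```python
import math

KBT = 4.1          # pN*nm, thermal energy near room temperature (assumed value)
B_KUHN = 106.0     # nm, Kuhn length used for illustration
N_SEG = 16400.0 / B_KUHN  # number of Kuhn segments for a 16.4 micron chain

def gp_extension(f, n_seg=N_SEG, b=B_KUHN, kbt=KBT):
    """Gaussian polymer: <z> = N b^2 f / (3 kBT), linear in f at all forces."""
    return n_seg * b**2 * f / (3.0 * kbt)

def fjc_extension(f, n_seg=N_SEG, b=B_KUHN, kbt=KBT):
    """Freely jointed chain: <z> = N b [coth(x) - 1/x], with x = b f / kBT."""
    x = b * f / kbt
    if abs(x) < 1e-4:
        # Series expansion of the Langevin function avoids cancellation at small x.
        return n_seg * b * (x / 3.0 - x**3 / 45.0)
    return n_seg * b * (1.0 / math.tanh(x) - 1.0 / x)

if __name__ == "__main__":
    for f in (0.001, 0.01, 0.1, 1.0, 10.0):  # forces in pN
        print(f, gp_extension(f) / (N_SEG * B_KUHN), fjc_extension(f) / (N_SEG * B_KUHN))
```

At the smallest forces the two relative extensions agree, while at 10 pN the GP value exceeds the contour length and the FJC value saturates as $1 - k_B T/(bf)$.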



C. Worm-Like Chain 



The Worm-Like Chain (WLC) is a continuous model without the sharp bends of the FJC. The chain is described
by its unit tangent vector $\mathbf{t}(s)$, as a function of contour length $s$ along the chain. If no forces are applied, the tangent
vector is presumed to undergo Gaussian fluctuations with zero mean and variance $\langle (d\mathbf{t}/ds)^2 \rangle = 1/A$ (Fig. 1). The
energy for this model in the presence of a force $f$ is given by

$$H_{WLC} = \frac{k_B T\, A}{2} \int_0^L ds \left(\frac{d\mathbf{t}}{ds}\right)^2 - f \int_0^L \hat{\mathbf{z}} \cdot \mathbf{t}(s)\, ds \qquad (6)$$



The first term is the curvature energy that accounts for the resistance of the chain to bending, and the second term is
the stretching energy due to application of the external force $f$. The partition function of a chain of length $L$, with
tangent vectors at the extremities $\mathbf{t}(s = 0) = \mathbf{t}_0$, $\mathbf{t}(s = L) = \mathbf{t}_1$, can be written as a path integral,

$$Z(L, f, \mathbf{t}_0, \mathbf{t}_1) = \int \mathcal{D}\mathbf{t}\; e^{-H_{WLC}/k_B T} \qquad (7)$$



The significance of the parameter A is made clear by the expression of average scalar product between tangent vectors 
at coordinates s and s' at zero force, 

$$\langle \mathbf{t}(s) \cdot \mathbf{t}(s') \rangle = \exp(-|s - s'|/A) \qquad (8)$$

Therefore A represents the characteristic distance above which tangent vectors decorrelate; A is called the persistence 
length. 

For a long chain under zero tension the WLC mean-squared end-to-end distance is $R^2 = 2AL$ for $L \gg A$ (the
formula for general $L$ is often useful). Therefore the unperturbed random coil properties of the WLC are
equivalent to those of the FJC and GP if we make the identification $2A = b$ and $L = Nb$. The unstretched WLC on
large scales becomes a random walk of $N = L/(2A)$ steps each of length $b = 2A$.
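For reference, integrating the tangent correlation (8) twice along the contour gives the standard closed form $R^2 = 2AL\,[1 - (A/L)(1 - e^{-L/A})]$ for any $L$. The sketch below evaluates it and checks the two limits (random coil for $L \gg A$, rigid rod for $L \ll A$); the function name and the numerical inputs are illustrative assumptions, not values prescribed by the text.

```python
import math

def wlc_mean_square_r(contour_length, persistence_length):
    """Mean-squared end-to-end distance of an unstretched WLC.

    Obtained by integrating <t(s).t(s')> = exp(-|s - s'|/A) over both s and s'.
    """
    L, A = contour_length, persistence_length
    # -expm1(-L/A) equals 1 - exp(-L/A) and stays accurate when L << A.
    return 2.0 * A * L * (1.0 - (A / L) * (-math.expm1(-L / A)))

if __name__ == "__main__":
    A = 53.0  # nm, illustrative persistence length
    for L in (1.0, 53.0, 16400.0):  # nm
        r2 = wlc_mean_square_r(L, A)
        print(L, r2, r2 / (2.0 * A * L), r2 / L**2)
```

For $L = 16.4$ microns and $A = 53$ nm the ratio to $2AL$ is within one percent of unity, consistent with the random-walk identification $b = 2A$.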

From a physical point of view, the FJC represents N uncorrelated dipoles in an electric field, and the average
orientation of one dipole at equilibrium is obtained by classical statistical mechanics (eqns (4), (5)). By contrast, the
WLC describes the "time" evolution of a dipole with moment of inertia A in an electric field, with the role of time
played by the contour length coordinate s. The introduction of the time dimension makes the WLC equivalent to a
quantum mechanical problem. The Schrödinger equation for the associated wave function can be analytically solved
for small and large force limits, and can be numerically solved for general force.

At small forces $f \ll k_B T/A$ the Hookean behavior (3) is recovered (i.e. with $b \to 2A$ and $N \to L/b$), while for large
forces $\langle z \rangle_{WLC}/L \approx 1 - \sqrt{k_B T/(4Af)}$.



III. ELASTICITY OF DOUBLE- AND SINGLE-STRANDED DNAS: EXPERIMENTS AND THEORY 

A. Double-stranded DNA (dsDNA) under tension 

Several reviews of experiments and theory on dsDNA elasticity are available. In Fig. 2 we report the force-extension
curve of a single dsDNA. Experimental data from [14, 24] for λ-DNA of length 48502 bp, or $L = 16.4$
microns, are shown, with fits of the GP, FJC, and WLC models with $A = b/2 = 53$ nm in 10 mM Na$^+$ buffer. For
dsDNA, the Kuhn length $b = 106$ nm is much larger than the natural base pair spacing of 3.4 Å. dsDNA is not
naturally described by the FJC model because consecutive bases, stacked onto each other, are not free to reorient
FIG. 2. Force-extension curve for a λ-dsDNA with equilibrium contour length $L = 16.4$ μm in 10 mM Na$^+$ buffer. Experimental
data (x) are from [24] and (+) from [14]. The plateau at $f \approx 70$ pN indicates the cooperative transition to S-DNA.
The extensible WLC model with persistence length $A = 53$ nm, Young modulus $\gamma \approx 1000$ pN, and two states dsDNA/S-DNA
reproduces accurately the experimental behavior [23]. Marko and Siggia's interpolation formula [3] (9) is very accurate up to
forces of 10 pN. Predictions from the GP and FJC ($b = 2A = 106$ nm) models are plotted in the Inset.



independently of each other. The success of the WLC shows that dsDNA behaves as a semiflexible polymer, with a
bending modulus $A k_B T$.

Several analytical interpolation formulae for the WLC, and modifications of the FJC introduced to fit the data
accurately, are discussed and compared in [10]. Marko and Siggia proposed a simple interpolation formula that is close
to the exact numerical solution of the force-extension response of the WLC [3],

$$f_{WLC}(z) = \frac{k_B T}{A}\left[\frac{z}{L} + \frac{1}{4(1 - z/L)^2} - \frac{1}{4}\right] \qquad (9)$$

This expression reduces to the exact solution as either $z \to 0$ or $z \to L$, but differs from the exact solution by up
to $\approx 10\%$ near $f = 0.1$ pN (Fig. 2). Bouchiat et al. [10] have introduced correction terms to eqn (9), in the form
of a seventh order polynomial in $z/L$. The resulting approximation for $f_{WLC}(z)$ is accurate to 0.1%. According to
formula (3), a force $f_0 = 3k_BT/b$ is required to extend dsDNA by a fraction of its contour length; from $b \approx 100$ nm
we see that the characteristic force associated with the entropic elasticity of dsDNA is $f_0 \approx 0.1$ pN.
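As an illustration, the Marko and Siggia interpolation formula and the characteristic entropic force can be evaluated directly. This is a sketch under stated assumptions: $k_B T = 4.1$ pN nm is an assumed room-temperature value, $A = 53$ nm and $b = 106$ nm are the fit values quoted in the text, and the function names are hypothetical.

```python
import math

KBT = 4.1  # pN*nm, assumed room-temperature thermal energy

def marko_siggia_force(z_over_l, persistence_length=53.0, kbt=KBT):
    """Interpolation formula: f = (kBT/A) [z/L + 1/(4 (1 - z/L)^2) - 1/4], in pN."""
    x = z_over_l
    if not 0.0 <= x < 1.0:
        raise ValueError("relative extension must satisfy 0 <= z/L < 1")
    return (kbt / persistence_length) * (x + 0.25 / (1.0 - x) ** 2 - 0.25)

def characteristic_force(kuhn_length, kbt=KBT):
    """Force scale f0 = 3 kBT / b of the Hookean (entropic) regime, in pN."""
    return 3.0 * kbt / kuhn_length

if __name__ == "__main__":
    print("f0 =", characteristic_force(106.0), "pN")
    for x in (0.1, 0.5, 0.9, 0.95):
        print(x, marko_siggia_force(x), "pN")
```

At small $z/L$ the formula reduces to the Hookean law $f = 3 k_B T z/(bL)$ with $b = 2A$, and near full extension it approaches the asymptote $z/L \approx 1 - \sqrt{k_B T/(4Af)}$.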

Fig. 2B shows that experimental data are well fitted by the FJC model for forces $f < 0.1$ pN, and by the WLC up
to $f < 5$ pN. Various experiments analyzed in terms of the WLC give $A = 50 \pm 5$ nm in 10 mM Na$^+$. The
persistence length of DNA is reduced in high salt concentrations by electrostatic screening of the repulsive charge
along the backbone; electrostatic effects have been taken into account in the WLC model by Barrat and Joanny
through Debye-Hückel interactions.

Fitting larger-force experimental data demands the introduction of the stretching elastic modulus of the molecule,
$\gamma \approx 1000 \pm 200$ pN, quantitatively consistent with the relation between the bending modulus $A k_B T$ and the Young
modulus $Y = \gamma/(\pi R^2) \approx 300$ MPa ($R = 10$ Å is the double helix radius) for an elastic rod,

$$A = \frac{\pi}{4 k_B T}\, Y R^4 = \frac{\gamma R^2}{4 k_B T} \qquad (10)$$

The value of the elastic modulus of DNA implies thermal fluctuations of the axial base pair distance $h$ of the
order of $\langle (h - \langle h \rangle)^2 \rangle \approx k_B T \langle h \rangle/\gamma \approx 0.14$ Å$^2$. This order of magnitude is in agreement with molecular
dynamics simulations, providing a consistent picture of the elasticity at the atomic and mesoscopic scales. Axial
vibrational modes have also been studied and compared to Raman spectroscopy measurements.
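The orders of magnitude in this paragraph follow from a few lines of arithmetic. The sketch below assumes $k_B T = 4.1$ pN nm and uses $\gamma = 1000$ pN, $R = 1$ nm and $\langle h \rangle = 0.34$ nm from the text; the variable names are illustrative.

```python
import math

KBT = 4.1        # pN*nm, assumed room-temperature thermal energy
GAMMA = 1000.0   # pN, stretching modulus quoted in the text
RADIUS = 1.0     # nm, double helix radius (10 Angstrom)
RISE = 0.34      # nm, base pair spacing (3.4 Angstrom)

# Young modulus of the equivalent rod; 1 pN/nm^2 equals 1 MPa.
young_mpa = GAMMA / (math.pi * RADIUS**2)

# Persistence length predicted by the elastic rod relation A = gamma R^2 / (4 kBT).
a_rod_nm = GAMMA * RADIUS**2 / (4.0 * KBT)

# Variance of the axial rise per base pair: spring constant gamma/<h> per base pair.
var_h_angstrom2 = (KBT * RISE / GAMMA) * 100.0  # 1 nm^2 = 100 Angstrom^2

if __name__ == "__main__":
    print("Y  =", young_mpa, "MPa")
    print("A  =", a_rod_nm, "nm")
    print("<dh^2> =", var_h_angstrom2, "Angstrom^2")
```

With these inputs the rod relation gives a persistence length of about 61 nm, the same order as the measured value of about 50 nm, and a variance of about 0.14 Å$^2$ for the axial rise.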
FIG. 3. Phase coexistence in a supercoiled DNA under tension. A fraction $x$ of the length is plectonemic supercoil with
radius $R$ and pitch $P$, while the remaining fraction $1 - x$ is in an extended conformation.



When a dsDNA molecule is subjected to a force of $f \approx 65$ pN it undergoes a structural transition to another
conformation, S-DNA, with a contour length 1.7 times larger than its B-DNA counterpart [14]. Numerical
investigations of the structure of S-DNA have been performed by Lavery and collaborators. The force plateau
around $f = 65$ pN corresponds to a highly cooperative transition, reminiscent of a first order phase transition. A two-state
model proposed by Cluzel et al., inspired by models introduced in the context of thermally-induced
denaturation, is able to reproduce the B-to-S transition. A recent study has suggested that the extended S
state is actually strand-separated, with the S phase described as stretched ssDNAs [58].

In the simplest two-state model of the B to S transition, the molecule is described as a chain of $N$ elements (base
pairs), which can be in states B (energy $E_B$, length $l_B$) or S (energy $E_S$, length $l_S > l_B$). Each element is associated with
a spin variable, $s = 1$ (respectively $-1$) for the B (resp. S) state. The energy of the chain reads,

$$H = \sum_{i=1}^{N} \left(E_{s_i} - f\, l_{s_i}\right) + \frac{\omega}{2} \sum_{i=1}^{N-1} \left(1 - s_i s_{i+1}\right) \qquad (11)$$

where $\omega$ represents a 'domain wall' energetic cost of a B-S frontier. Up to an additive constant,

$$H = -\frac{\omega}{2} \sum_{i} s_i s_{i+1} - \frac{h}{2} \sum_{i} s_i \qquad (12)$$

where $h = \Delta E - f\,\Delta l$, with $\Delta E = E_S - E_B$ and $\Delta l = l_S - l_B$. Notice that eqn (12) is simply the Hamiltonian of a one-dimensional Ising
model with magnetic field $h = \Delta E - f\,\Delta l$. The extension is obtained from the derivative of the free energy with
respect to the force $f$. Comparison with experiments allows a quantitative determination of the domain wall energy,
$\omega \approx 4 k_B T$. The extensible WLC including nonlinearities which define two states of extension provides a way to
fit the force-extension curve over a wide range of forces $0.01 < f < 100$ pN.
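The cooperativity introduced by the domain wall energy can be illustrated with the exact infinite-chain solution of the one-dimensional Ising model. The sketch below is a toy version of the two-state part only (it ignores the WLC elasticity within each state). The parameter choices are assumptions made for illustration: $k_B T = 4.1$ pN nm, $l_B = 0.34$ nm, $l_S = 1.7\, l_B$, and $\Delta E$ fixed so that the field $h$ vanishes at 65 pN.

```python
import math

KBT = 4.1  # pN*nm, assumed room-temperature thermal energy

def two_state_extension(f, omega_kbt=4.0, l_b=0.34, l_s=0.578, f_star=65.0, kbt=KBT):
    """Mean length per base pair (nm) of the B/S two-state chain at force f (pN).

    Spin s = +1 is B, s = -1 is S, with H = -(omega/2) sum s s' - (h/2) sum s
    and h = dE - f dl. dE is chosen so that h = 0 at f_star (an assumption).
    """
    d_l = l_s - l_b
    d_e = f_star * d_l
    x = 0.5 * (d_e - f * d_l) / kbt        # beta * h / 2
    wall = math.exp(-2.0 * omega_kbt)      # exp(-4 beta J) with J = omega / 2
    # Exact mean spin of the infinite 1D Ising chain (transfer matrix result).
    m = math.sinh(x) / math.sqrt(math.sinh(x) ** 2 + wall)
    return 0.5 * (1.0 + m) * l_b + 0.5 * (1.0 - m) * l_s

if __name__ == "__main__":
    for f in (30.0, 60.0, 64.0, 65.0, 66.0, 70.0, 100.0):
        print(f, two_state_extension(f), two_state_extension(f, omega_kbt=0.0))
```

Comparing the two printed columns shows how a wall cost of $4 k_B T$ compresses the transition into a range of a few pN around the plateau force, whereas independent base pairs ($\omega = 0$) give a much broader crossover.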



B. Supercoiled DNA under tension 

dsDNA differs from simpler polymers because it exhibits torsional as well as bending stiffness. Try to impose a twist on
an elastic rod while keeping it extended and fixed at one end. Then, if you relieve the tension, an interwound structure
called a plectoneme will appear, Fig. 3 (twisted telephone cords often form plectonemic supercoils). Similarly, dsDNA
under sufficient torsional stress interwinds to form plectonemic supercoils. Formally, the over- or underwinding of
DNA is measured by the change in double-helical linking number. This is often expressed as $\sigma$, the fractional change in
linking number relative to that of relaxed dsDNA (one right-handed, or positive link per 10.5 base pairs). Supercoiling
of DNA is of tremendous importance to eubacteria. For example, in E. coli all the DNA is held under torsional stress
and is topologically constrained with $\sigma \approx -0.06$.
The elasticity of a single supercoiled DNA molecule has been experimentally measured by Strick et al. [24] and
by Léger et al. A rich behavior was observed. At small forces the molecule responds to increasing positive or
negative supercoiling by first having its conformations slightly chirally perturbed, and then by forming plectonemes
with appreciable shortening of its length. At forces $f > 0.5$ pN, negative supercoiling is released through the opening of
the double helix into denaturation bubbles. At forces $f > 3$ pN, positive supercoiling induces the formation of regions
exhibiting a new conformation called P-DNA. The structure of P-DNA has been deduced by molecular modeling;
it is essentially characterized by its exposed bases. P-DNA can be thought of as two tightly interwound ssDNAs.

The theory of stretched supercoiled DNA was initiated by Marko and Siggia, who considered phase coexistence
of linear, plectonemic, and denatured DNA in different regions of a supercoiled molecule. The relative extensions
of these portions are determined by the degree of supercoiling $\sigma$ and the stretching force $f$.

At small forces, a fraction $x$ of the molecule is in the plectonemic (p) regime, whereas the remaining $1 - x$ fraction
is extended (s) (Fig. 3). The free energy per unit of length $\mathcal{F} = F/L$ reads,

$$\mathcal{F}(\sigma, z/L) = x\,\mathcal{F}_p(\sigma_p) + (1 - x)\,\mathcal{F}_s(f, \sigma_s) \qquad (13)$$

$\sigma = \Delta Lk/Lk_0$ is the excess of density of supercoiling with respect to the rest configuration (dsDNA making a double
helix turn in 10.4 bases, $Lk_0 = L/10.4$); $\sigma$ is partitioned into plectonemic and extended regions: $\sigma = x\sigma_p + (1 - x)\sigma_s$.
The free energy of the extended phase equals the WLC free energy plus the twisting energy,
$\mathcal{F}_s(f, \sigma_s) = \mathcal{F}_{WLC}(f) + k_B T\, C\, Lk_0^2/(2L^2)\,(2\pi\sigma_s)^2$. $C$ is the twist persistence length, known from experiment to be $C = 75 \pm 30$ nm.

The free energy of the plectonemes is considered to be the sum of elastic (bending and twisting), electrostatic
and entropic contributions, minimized over plectonemic parameters, e.g. the pitch $P$ and radius $R$ (Fig. 3). The
entropic term represents the entropy lost by confining the DNA in the superhelix formation. The total free energy
is obtained by minimization with respect to the plectonemic portion $x$. At large forces the structural transition to
denatured DNA is also included in the model by allowing the plectonemic phase to be a mixture of denatured and
normal plectonemic DNA. The theoretical force-extension curve at fixed supercoiling reproduces the experimental data.

A semi-microscopic model has been used to describe thermal- and torque-induced denaturation in one phase diagram.
This work described the formation of denaturation bubbles when DNA is stretched at $f > 0.5$ pN and negatively
supercoiled. The critical torque at room temperature, $\Gamma \approx -2 k_B T$, is in good agreement with the value inferred from
the experiments by Strick et al. [24].

A generalization of the two-state Ising description (12) of the overstretching transition has been introduced by
Sarkar et al. [30] to model the structural transitions observed for a twisted and stretched DNA molecule. For
each site five possible states are introduced: dsDNA, S-DNA, P-DNA, sc-PDNA (a supercoiled P state), and a left-handed
double helix, Z-DNA. This last state, with a supercoiling degree $\sigma_Z = -1.3$, is proposed as an alternative
to denatured ssDNA ($\sigma = -1$). A force-torque diagram is derived that agrees with the experiments on the critical
unwinding torque at zero force, $\approx -2 k_B T$, and the torque to drive DNA into the P structure, $\approx 7 k_B T$.

The elasticity of supercoiled DNA has also been studied at a more microscopic level. Twisting and bending
deformations can be described by extending the WLC to include a description of base-pair orientation using a triad of
unit vectors (WLRC) (Fig. 1). Moroz and Nelson, and Bouchiat and Mézard, have written the partition
function of this model as a path integral in the space of the Euler angles parametrizing the orientations, limiting
the integration measure to the paths with a fixed linking number $Lk$. The free energy is obtained from the ground
state energy of a Schrödinger equation describing a particle moving on a unit sphere in the presence of electric and
magnetic fields.

The WLRC model does not take into account self-avoidance: the WLRC chain is a phantom chain that can pass
through itself. The result is that linking number fluctuations are not well-defined in the continuum limit. This
problem is reflected in a divergence of the ground state energy. This is strictly a technical problem since real DNA
has self-avoidance interactions. To avoid this problem, Moroz and Nelson considered the unambiguous infinite force
situation (fully stretched molecule), and obtained finite force results by means of perturbation theory. Bouchiat and
Mézard have introduced a short distance cutoff (discretization of the chain) to suppress the singularity. Fitting the
theory to experimental data, the twist persistence length $C$ can be determined but is largely dependent on the
theoretical scheme followed: $C = 120$ nm is obtained by Moroz and Nelson, while $C = 82$ nm is obtained by Bouchiat
and Mézard for a cutoff length of 7 nm.
FIG. 4. Top: Force-extension curve for a λ-ssDNA with equilibrium contour length $L_{ss} = l_{ss} N = 0.56$ nm $\times$ 48502 bp $\approx 27$ μm
in 150 mM Na$^+$. Experimental data (+), dashed line: FJCL with $b = 1.5$ nm, dotted line: extensible FJCL with $\gamma = 800$ pN,
from [11]. Bottom: Force-extension curve for a Charomid ssDNA in 10 mM phosphate buffer, 5 mM Mg$^{++}$ buffer; data are
from [38], the fits are with the FJCL with $b = 1.9$ nm and $\gamma = 800$ pN (dotted line) and with the hairpin model (full line) of
[41].



C. Single-stranded DNA under tension 

Single-stranded DNA (ssDNA) is more flexible and can reach a larger extension per base pair than dsDNA. A
sensible simple model of ssDNA is, at first sight, a FJC with a Kuhn length equal to the sugar-phosphate monomer
backbone length $b = 7$ Å. However, the ssDNA elasticity is complicated by nucleotide interactions, and as a result
simple polymer models do not describe ssDNA elasticity over a wide range of forces.

For forces $f < 20$ pN the experimental force-extension curve for a 48502 base λ-ssDNA in 150 mM Na$^+$ has been
fitted with a FJC-like (FJCL) model by Smith et al. [11] with two effective parameters: a Kuhn length $b = 15$ Å and
a contour length per base pair $l_{ss} = 5.6$ Å that differs from the backbone distance (see Fig. 4). Note that due to
the higher flexibility, the characteristic entropic force of the single strand, $f_0 = 3k_BT/\sqrt{b\, l_{ss}} \approx 10$ pN (3), is much
higher than for dsDNA. At forces $f > 15$ pN the fit requires the introduction of a stretching modulus $\gamma = 800$ pN
(Section III A).



ssDNA elasticity depends strongly on salt concentration. At low salt (1 mM Na$^+$), self-avoidance
due to electrostatic self-repulsion along the charged sugar-phosphate backbone occurs. The experimentally observed
logarithmic-like dependence of the extension upon force is well reproduced by Monte Carlo simulations. Electrostatic
self-avoiding effects can be taken into account analytically using the Barrat and Joanny formalism [36], or with
a Hartree-Fock calculation from the WLC models [37].

At higher salt concentration (> 100 mM Na$^+$, or in the presence of Mg$^{++}$) formation of secondary structure ('hairpins')
by hydrogen bonding between complementary bases on the same strand strongly influences elastic properties.
Experiments show that the force-extension curve depends on the strand GC vs. AT content, and can be
modulated using denaturing chemical agents that suppress hydrogen bonding [38]. A theoretical analysis of the
elasticity of a polymer with hairpin secondary structure has been developed by Montanari and Mézard [42]. Conformations
of hairpins taken into account are such that any two pairs of paired bases ($i < j$, $k < l$) are independent
($i < j < k < l$), or nested ($i < k < l < j$) [39]. This representation does not include pseudoknots but leads to a
solvable recursion relation relating the partition functions of successively larger sequences, under the simplifying
hypothesis that any two bases, e.g. AT, GC, AG, ..., have the same pairing free energy. For an infinite molecule, a
phase transition takes place between a folded (zero extension, $f < f_s$) and an extended phase ($f > f_s$) with $f_s$ of the
order of 1 pN, given reasonable choices of parameters (see Fig. 4).
FIG. 5. Sketch of a DNA molecule with $n$ base pairs unzipped, as a result of a mechanical stress (applied force $f$). The
distance between the two ssDNA ends is defined to be $2r$.

D. Elasticity of DNA in presence of DNA-folding proteins

In eukaryotic cells, DNA is wrapped around octamers of histone proteins to form a more compact structure called
a nucleosome. The long chromosomal DNAs of eukaryotic cells are thus organized into long strings of nucleosomes, or
'chromatin fibers'. A single eukaryotic chromosome may contain more than $10^8$ base pairs of DNA and roughly $10^6$
nucleosomes.

The elasticity of chromatin fiber has been experimentally studied by Cui and Bustamante. The experimental
curves can be fitted with polymer models composed of units which, independently of each other, can be in a folded
(short) or unfolded (long) state. These states are taken to correspond to stacked and unstacked nucleosomes.
The elastic response of whole mitotic chromosomes can be related back to this fiber elastic response.

Very roughly, models of DNA folding by proteins will generally show a characteristic force at which the proteins
will dissociate in equilibrium. Given a free energy difference between the folded and unfolded states of $g$ per fold,
and given an end-to-end length reduction of d, this characteristic force will be about g/d. Note that for large values 
of d (e.g. by formation of large DNA loops, a common feature of gene-regulatory proteins) this implies low on-off 
equilibrium forces. It must be kept in mind that if the enthalpic component of the binding free energy is large, there 
may be large barriers for such loops to open and close, making equilibrium difficult to reach. Such situations should 
show theoretically interesting many-body thermal barrier-crossing kinetic phenomena. 



IV. DNA AND RNA UNZIPPING 



Essevaz-Roulet, Bockelmann and Heslot have shown that the two strands of a dsDNA can be pulled apart by a
force $\approx 12$ pN (Fig. 5). Variations of the 'unzipping' force correspond to the DNA sequence, through the known
relationship between DNA sequence and base-pair interaction strengths. Higher forces were shown to correspond
to DNA regions with higher GC densities, which in general have stronger base-pairing interactions than AT-rich
sequences. Experimental unzipping force traces show a series of sawtooth signals attributed to stick-slip motion, with
the sticking generated by DNA regions with higher GC content. This kind of experiment amounts to 'feeling' DNA
sequence.

Current techniques are able to observe GC-content over long stretches of DNA ($\approx 10$ kb) with about 10 base pair
resolution. It has also been demonstrated that unzipping is sensitive to at least some single-base
substitutions. We now discuss some equilibrium and dynamical aspects of DNA and RNA unzipping.

A. Thermodynamics of DNA-RNA unzipping 



Control parameters for unzipping vary from experiment to experiment. Roughly speaking, either the force or the 
distance between strand extremities may be kept fixed (Fig. 5). 



1. Fixed pulling force 

With a fixed force $f$ on the molecule ends, the free energy $G$ of the molecule with $n$ base pairs opened is the
difference between the free energy of the two extended single strands of $n$ bases each, and the free energy lost in
unpairing the first $n$ base pairs ($i = 1, \ldots, n$):

$$G(f, n) = 2n\,\mathcal{F}_{ss}(f) - \sum_{i=1}^{n} g_{ds}(i) \qquad (14)$$

As discussed above, the free energy per base of stretched ssDNA, $\mathcal{F}_{ss}(f)$, can be expressed using the FJCL model
for forces $f \approx 10$ pN. At this high tension, nucleotide hairpin-formation effects are absent. Quadratic expansion
of $\mathcal{F}_{ss}(f)$ around $f = 10$ pN gives the free energy of a Gaussian polymer, $\mathcal{F}^{GP}_{ss}(f) = -f^2 b^2/(6 k_B T)$, with an effective
Kuhn length $b = 7$ Å.

We start by considering a homogeneous sequence, where all base pairs have pairing free energy $g_{ds} = -g_0$. The
unzipping critical force $f_u$ is simply given by the condition

$$G(f, n) = n\, g(f) = \left[2\mathcal{F}_{ss}(f) + g_0\right] n = 0 \qquad (15)$$

For $f < f_u$, dsDNA is stable, and if $f > f_u$, the double helix unzips as in a first order phase transition. Using the
Gaussian model for the ssDNA, we obtain $f_u^{GP} = \sqrt{3 k_B T g_0}/b$. The unzipping force of a homogeneous sequence is
therefore directly related to the pairing free energy.

Rief et al. measured the unzipping forces $f_u$ for DNAs of various repeated sequences. It was found that
$f_u$(poly dA-dT) $= 9 \pm 3$ pN and $f_u$(poly dG-dC) $= 20 \pm 3$ pN, giving $g_0^{GP}(A-T) = 0.8\, k_B T$, $g_0^{FJCL}(A-T) = 1.1\, k_B T$,
$g_0^{GP}(G-C) = 4.2\, k_B T$, and $g_0^{FJCL}(G-C) = 3.5\, k_B T$ respectively. These values of the free energies of denaturation
are compatible with thermodynamic data based on DNA melting. It is to be noted that unzipping experiments
give the only direct measurement of the relative free energies of ss and dsDNA at equal temperatures.
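The Gaussian-model relation between unzipping force and pairing free energy is easy to invert numerically. The following sketch assumes $k_B T = 4.1$ pN nm and the effective Kuhn length $b = 0.7$ nm quoted above; the function names are hypothetical, and the results depend on the value chosen for $k_B T$, so they should only be expected to match the quoted Gaussian-model numbers approximately.

```python
import math

KBT = 4.1    # pN*nm, assumed room-temperature thermal energy
B_SS = 0.7   # nm, effective ssDNA Kuhn length from the text (7 Angstrom)

def unzip_force_gp(g0_kbt, b=B_SS, kbt=KBT):
    """Critical force f_u = sqrt(3 kBT g0) / b in pN, with g0 given in units of kBT."""
    return math.sqrt(3.0 * kbt * (g0_kbt * kbt)) / b

def pairing_free_energy_gp(f_u, b=B_SS, kbt=KBT):
    """Inverse relation: g0 = f_u^2 b^2 / (3 kBT), returned in units of kBT."""
    return (f_u * b) ** 2 / (3.0 * kbt**2)

if __name__ == "__main__":
    for label, f_u in (("poly dA-dT", 9.0), ("poly dG-dC", 20.0)):
        print(label, f_u, "pN ->", pairing_free_energy_gp(f_u), "kBT")
```

With these inputs the AT value comes out close to $0.8\, k_B T$ and the GC value close to $4\, k_B T$, in line with the Gaussian-model estimates above.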

Unzipping has been discussed in the language of continuous phase transitions by Lubensky and Nelson [48, 49]. The
unzipping free energy per base pair, $g(f)$, vanishes as $f_u - f$ with a discontinuous slope, as in a first order phase
transition. However (as in interfacial wetting) as the unzipping transition is approached from below, i.e. $f \to f_u^-$,
the average number $\langle n \rangle$ of open base pairs undergoes a continuous power-law divergence. From the
probability to have $n$ open base pairs, $P(n) = [g(f)/(k_B T)] \exp[-n\, g(f)/(k_B T)]$, one obtains

$$\langle n \rangle = \frac{k_B T}{g(f)} \propto \frac{1}{f_u - f} \qquad (16)$$
When the DNA molecule is subjected to a torque $\Gamma$, a torque-angle work contribution occurs in the unzipping free
energy,

$$g(f, \Gamma) = 2\mathcal{F}_{ss}(f) + g_0 - \theta_0\,\Gamma. \qquad (17)$$

$\theta_0 = 2\pi/10.4$ is the change in twist angle during conversion of dsDNA to separated strands. The phase diagram
for the unzipping transition in the force, torque plane is shown in Fig. 6.

Along heterogeneous sequences, the free energy to open the first $n$ base pairs, $G(f, n)$ (14), can be calculated using
the sequence-dependent pairing free energy $g_{ds}(i)$ (e.g. from the Mfold server [50]). In Fig. 7 we show $G(f, n)$ for
the RNA molecule called P5ab, mechanically unzipped by Liphardt et al. with a force maintained fixed at the
extremity through a feedback mechanism. The critical force is defined by the condition that the closed and open states
have equal minimal free energies. Contrary to the homogeneous case, the free energy landscape at the critical force
is not flat. It is characterized by high energy barriers $G^* \approx 10\, k_B T$. The probability to have $n$ open base pairs is
essentially zero if $n$ differs from the open and the closed configurations. Indeed, experiments show that at
the critical force the molecule essentially hops between open and closed configurations.
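A toy version of such a landscape can be built from eqn (14) with the Gaussian ssDNA free energy. The sketch below is not the P5ab calculation: it assumes $k_B T = 4.1$ pN nm, $b = 0.7$ nm, a hypothetical eight-base-pair sequence, and sequence-independent pairing free energies of 0.8 and 4.2 $k_B T$ for AT and GC pairs (the homopolymer Gaussian-model values quoted earlier), ignoring stacking and loop contributions.

```python
import math

KBT = 4.1    # pN*nm, assumed room-temperature thermal energy
B_SS = 0.7   # nm, effective ssDNA Kuhn length

# Toy pairing free energies g0 (in kBT) per base pair; context effects are ignored.
PAIRING_G0 = {"A": 0.8, "T": 0.8, "G": 4.2, "C": 4.2}

def landscape(sequence, f):
    """G(f, n) in kBT for n = 0..N: each opened pair adds 2 F_ss(f) + g0(i)."""
    f_ss = -((f * B_SS) ** 2) / (6.0 * KBT**2)  # Gaussian ssDNA free energy per base
    g = [0.0]
    for base in sequence:
        g.append(g[-1] + 2.0 * f_ss + PAIRING_G0[base])
    return g

def critical_force(sequence):
    """Force at which the fully open and fully closed states have equal free energy."""
    mean_g0 = sum(PAIRING_G0[base] for base in sequence) / len(sequence)
    return math.sqrt(3.0 * mean_g0) * KBT / B_SS

if __name__ == "__main__":
    seq = "GGGGAAAA"  # hypothetical sequence: GC-rich region opened first
    f_c = critical_force(seq)
    print("critical force:", f_c, "pN")
    print([round(x, 2) for x in landscape(seq, f_c)])
```

For this hypothetical sequence the landscape at the critical force rises through the GC-rich half and falls through the AT-rich half, giving a barrier of several $k_B T$ between the closed and open states.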

Lubensky and Nelson [48, 49] have shown that the critical behavior around $f_u$ changes for large random sequences
with respect to homogeneous sequences. Instead of the divergence $1/(f_u - f)$ for the averaged number $\langle n \rangle$ of open base
pairs, a stronger singularity $\langle n \rangle \approx 1/(f_u - f)^2$ appears.
FIG. 6. DNA unzipping phase diagram as a function of torque $\Gamma$ in units of $k_B T$ and force $f$ in pN, from [54]. The solid
line shows the results for the FJCL model of ssDNA elasticity, while the dashed line shows the result within the Gaussian
approximation. At zero torque, the unzipping force is $f_u \approx 12$ pN; positive torques increase $f_u$, negative torques reduce $f_u$ until
it vanishes at $\Gamma = -2.4\, k_B T$.



2. Fixed distance between extremities 

If the ssDNA ends are held apart at some distance 2r (Fig. ||), some average number of bases n will open. In 
the ideal case of a rigid opening device, the free energy cost to open n base pairs is a sum of chain stretching and 
denaturation contributions, 

$$ H_d(r,n) = W_{ss}(2r, 2n) - \sum_{i=1}^{n} g_{ds}(i) . \qquad (18) $$

W ss (2r, 2n) is the work done by the force to stretch 2n base pairs of ssDNA at a distance 2r. For simplicity we 
consider the GP free energy (||), considering as in the previous section the effective Kuhn length b = 7 A , from the 
interpolation formula (j^) fl52| , or from numerical inversion of the FJC extension vs. force (||) ]5q] . Thus, unzipping 
at fixed extension can be described using

$$ W_{ss}^{GP}(2r, 2n) = 3 k_B T\, \frac{r^2}{n b^2} . \qquad (19) $$

The most probable value of the number of opened base pairs $n$ is obtained by minimization of the free energy (18, 19)
with respect to $n$ (with $g_{ds}(i) = -g_0$ for a homogeneous sequence). The number of unzipped base pairs is found to scale
linearly with the distance, $n(r) = r/d_u$, where $d_u = \sqrt{g_0/(3 k_B T)}\; b \simeq 5$ Å is the projection of the monomer length
along the force direction. The resulting free energy reads

$$ F(r) = 2 \sqrt{3 k_B T g_0}\; r/b = 2\, n(r)\, g_0 . \qquad (20) $$

The tension $f$ in the chain is simply the derivative of $F$ with respect to $2r$, $f = \sqrt{3 k_B T g_0}/b \simeq 12$ pN. This simple
calculation shows that as unzipping proceeds quasi-statically, the ssDNA tension is a constant, just $f_u$. Note that the
excess free energy per unpaired base for fixed extension is double the free energy of denaturation because the work
done extending the ssDNAs adds to the work done when opening the molecule. This indicates a strategy to determine
$g_{ds}$ unambiguously from unzipping force data.
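
The fixed-distance minimization can be checked numerically. The sketch below is illustrative only: it assumes a homogeneous sequence, the Gaussian stretching work $3 k_B T r^2/(n b^2)$, a room-temperature value $k_B T = 4.1$ pN nm, and it infers $g_0$ from the quoted 12 pN unzipping force. All function names are hypothetical.

```python
import numpy as np

KBT = 4.1       # thermal energy in pN*nm (room temperature, assumed)
B_KUHN = 0.7    # effective ssDNA Kuhn length b = 7 Angstrom, in nm
F_U = 12.0      # unzipping force quoted in the text, in pN

# Pairing free energy per base pair implied by f_u = sqrt(3 kBT g0) / b.
G0 = (F_U * B_KUHN) ** 2 / (3.0 * KBT)          # pN*nm

# Predicted extension gained per unzipped base pair, d_u = b sqrt(g0 / 3 kBT).
d_u = B_KUHN * np.sqrt(G0 / (3.0 * KBT))        # nm


def free_energy(r, n):
    """Gaussian stretching work plus denaturation cost for n open base pairs."""
    return 3.0 * KBT * r**2 / (n * B_KUHN**2) + n * G0


def most_probable_n(r, n_max=20000):
    """Brute-force minimization of the free energy over integer n."""
    n = np.arange(1, n_max + 1)
    return int(n[np.argmin(free_energy(r, n))])


def minimized_free_energy(r, n_max=20000):
    """Free energy at the optimal number of open base pairs."""
    n = np.arange(1, n_max + 1)
    return float(free_energy(r, n).min())


def tension(r, dr=1e-3):
    """Central-difference derivative of the minimized free energy w.r.t. 2r."""
    dF = minimized_free_energy(r + dr) - minimized_free_energy(r - dr)
    return dF / (2.0 * (2.0 * dr))
```

With these assumed numbers, `d_u` comes out close to 0.5 nm, `most_probable_n(r)` follows `r / d_u` to within one base pair, and `tension(r)` stays near 12 pN independently of `r`, in line with the constant-tension statement above.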

The analysis of the fluctuations around the minimum free energy gives $(f - f_u) = O(1/r^2)$. By contrast, in the
constant-force ensemble, $f - f_u \sim 1/\langle r \rangle$ [49]. The fixed-distance and fixed-force ensembles are equivalent
only in the thermodynamic limit $r \to \infty$.

The unzipping force at small distance between extremities is sensitive to the detailed structure of the pairing
potential as a function of the interbase distance. The semimicroscopic model introduced in [54], [55] predicts a force
barrier of $\simeq 300$ pN at a distance $2r \simeq 21$ Å, at which the hydrogen bonds are broken but the bases are still stacked
in the double helix configuration. This force barrier might be directly observable in an AFM experiment.

To take into account the experimental apparatus, Bockelmann et al. have included in their theory the effects of
the two dsDNA linker arms (of $N_{ds}$ base pairs and extension $r_{ds}$) and the cantilever stiffness [58], [59]. The free energy
is then

FIG. 7. Top: Free energy (in $k_B T$) and probability distribution of the opening fork at the critical force $f = 15$ pN for the
P5ab molecule (middle), from [59]. Bottom: Schematic representation of the free energy landscape for a displacement of 4970
nm (left, corresponding to the slip phase in the force) and of 5050 nm (right, corresponding to the stick phase in the force) during
the unzipping of a λ-DNA (from [58]).



$$ H_2(r, r_{ss}, r_{ds}, n) = W_{ss}(2 r_{ss}, 2n) + W_{ds}(r_{ds}, N_{ds}) - \sum_{i=1}^{n} g_{ds}(i) + \frac{1}{2} k_{lever} (r - 2 r_{ss} - r_{ds})^2 . \qquad (21) $$

The partition function is obtained by summing over all possible values of $n$, $r_{ss}$, $r_{ds}$. The free energy $g_{ds}(i)$ can be
computed using e.g. the Mfold program for base-pair interactions [50]. The ssDNA, the dsDNA and the lever can be
considered as three springs in series, with an effective stiffness $k_{tot}$ given by $k_{tot}^{-1} = k_{ss}^{-1} + k_{ds}^{-1} + k_{lever}^{-1}$. The spring constant
of dsDNA, $k_{ds} \simeq 0.03$ pN/nm for a dsDNA total length of 15000 bases in the experiment of Bockelmann et al., is
obtained from the derivative of $f_{WLC}$ (including the Young modulus) calculated at a force of 12 pN. The ssDNA
stiffness is $k_{ss} \simeq 6 k_B T/(b^2 n) \simeq 50/n$ pN/nm, and the cantilever stiffness equals $k_{lever} = 0.25$ pN/nm. For fewer than
$\sim 1500$ unzipped base pairs, $k_{tot}$ essentially reduces to the dsDNA linker stiffness. When more bases are unzipped,
the dominant contribution comes from the ssDNA stiffness.
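
The crossover between the two stiffness regimes can be illustrated with a few lines of arithmetic. This is a minimal sketch using the numbers quoted above, together with an assumed room-temperature value $k_B T = 4.1$ pN nm and $b = 0.7$ nm. The function names are hypothetical.

```python
KBT = 4.1        # thermal energy in pN*nm (room temperature, assumed)
B_KUHN = 0.7     # effective ssDNA Kuhn length in nm
K_DS = 0.03      # dsDNA linker stiffness in pN/nm (value quoted in the text)
K_LEVER = 0.25   # cantilever stiffness in pN/nm (value quoted in the text)


def k_ss(n):
    """ssDNA stiffness for n unzipped base pairs, 6 kBT / (b^2 n), in pN/nm."""
    return 6.0 * KBT / (B_KUHN**2 * n)


def k_tot(n):
    """Effective stiffness of ssDNA, dsDNA linkers and lever in series."""
    return 1.0 / (1.0 / k_ss(n) + 1.0 / K_DS + 1.0 / K_LEVER)


def crossover_n():
    """Number of unzipped base pairs at which k_ss equals k_ds."""
    return 6.0 * KBT / (B_KUHN**2 * K_DS)
```

For small `n` the value of `k_tot(n)` is set by the soft dsDNA linkers (slightly reduced by the lever), while for `n` well above `crossover_n()`, which is on the order of the 1500 base pairs quoted above, it approaches `k_ss(n)`.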

The DNA sequence dependence results in a complicated free energy landscape that generates a 'stick-slip' variation
of the force during unzipping [59]. The stick phase corresponds to the presence of one deep minimum, and the slip
phase to a flat free energy landscape (see Fig. 7, bottom). The analytical description of Bockelmann et al. predicts that the
height $h$ of the potential barrier increases as $h \sim S^2$ with the fluctuations $S$ of the pairing free energy, and decreases
as $h \sim k_{tot}^{-1}$ with the effective stiffness.



B. Kinetics of unzipping 

The kinetics of unzipping at short length scales is affected by the presence of barriers with various physical origins, 
which makes it an activated process. 

FIG. 8. Schematic representation of the saddle-point polymer conformation $r^*(n)$ which crosses the barrier of $V(r)$; $r_i$ and
$r_f$ are the radii of the edges of the nucleation bubble.



1. Homogeneous sequences and unzipping initiation barriers 

Unzipping of homogeneous DNA requires the crossing of a barrier whose physical origin is the following. We take
$r$, the half distance between the two bases of one pair, to be a reaction coordinate indicating whether the pair is bonded
($r \simeq 10$ Å) or open ($r > 11$-$12$ Å, due to the very short range of the H-bond). The effective free energy $V(r)$ of the base
pair as a function of r is shown in Fig. |[ It is low for both small (pairing energy) and large (entropy gain) values of 
$r$, and exhibits a maximum around $r \simeq 10.5$ Å where the H-bond is broken but bases are still stacked in the double
helix conformation and are not free to move [54]. The set of half distances $r(n)$ between the two strands defines an
abstract polymer (Fig. 8). At low enough forces this polymer is confined to the potential well (closed state). When a
force $f$ larger than $f_u$ is applied at one extremity, the polymer escapes from the well (unzipping). As in a first-order
phase transition, nucleation theory can be employed to understand the opening kinetics [56].

The kinetics are slowed because of the activated crossing of the free energy barrier. A saddle-point calculation
provides the optimal configuration of the polymer for crossing the barrier [54]. This configuration is made of a
transition 'bubble' a few ($\sim 4$) bases long, with free energy cost $G^*(f)$. The time of unzipping initiation grows
exponentially with the activation free energy $G^*(f)$, $t(f) = t_0 \exp(G^*/k_B T)$. The elementary time $t_0$ corresponds to
the time necessary for the polymer to escape from the saddle-point configuration along the unstable direction in the
free energy landscape [56]. When the applied force is smaller than $f_u$, the molecule may still unzip. The opening
time, which still depends on the barrier free energy, is now exponentially large in the equilibrium free energy $N g(f)$,
where $N$ is the number of base pairs.

The determination of the opening time $t(f)$ at fixed force provides in turn the distribution of unzipping forces when
the molecule is loaded at a fixed rate [57]. The most probable unzipping force exhibits a rich pattern depending on
the loading rate and on the length of the sequence [54], [55].
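
The passage from a fixed-force opening time $t(f)$ to a distribution of unzipping forces under a linear force ramp can be sketched as follows. The survival probability up to force $f$ at loading rate $\lambda$ is $S(f) = \exp[-\lambda^{-1} \int_0^f df'/t(f')]$, and the force density is $p(f) = S(f)/(\lambda\, t(f))$. The barrier $G^*(f)$ used below, decreasing linearly with force, is a purely illustrative assumption and not the saddle-point result of the references cited above. The prefactor, barrier height and length scale are hypothetical numbers, and this toy model is not expected to reproduce the length dependence mentioned in the text.

```python
import numpy as np

KBT = 4.1  # thermal energy in pN*nm (room temperature, assumed)


def lifetime(f, t0=1e-6, barrier_kbt=20.0, x_nm=1.0):
    """Hypothetical opening time t(f) = t0 * exp(G*(f) / kBT), in seconds.

    G*(f) = barrier_kbt * kBT - f * x_nm (floored at zero) is an assumed form.
    """
    g_star = np.maximum(barrier_kbt * KBT - f * x_nm, 0.0)
    return t0 * np.exp(g_star / KBT)


def unzipping_force_density(forces, loading_rate):
    """Probability density of the opening force for a ramp f = loading_rate * t."""
    rate = 1.0 / lifetime(forces)                       # escape rate, 1/s
    # Cumulative trapezoidal integral of the escape rate over force.
    steps = 0.5 * (rate[1:] + rate[:-1]) * np.diff(forces)
    integral = np.concatenate(([0.0], np.cumsum(steps)))
    survival = np.exp(-integral / loading_rate)
    return survival * rate / loading_rate


def most_probable_force(loading_rate, f_max=150.0, n_points=15001):
    """Force (pN) at which the unzipping-force density is maximal."""
    forces = np.linspace(0.0, f_max, n_points)
    density = unzipping_force_density(forces, loading_rate)
    return float(forces[np.argmax(density)])
```

Calling `most_probable_force` for loading rates of 1, 10 and 100 pN/s shows the most probable force increasing with the loading rate, roughly logarithmically for this assumed barrier.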



2. Sequence-dependent barriers 

The above mechanism is mostly relevant to the kinetics of initiation of opening of the double helix. During unzipping
of a long double helix, the sequence-dependent free energy landscape of Fig. 7, with barriers of $\sim 10\, k_B T$, is responsible
for a slow hopping dynamics between the open and closed states, i.e. the stick-slip behavior observed by Bockelmann
et al. [58], [59].



In experiments by Liphardt et al. [51], small helix-loop RNA structures (essentially short regions of double-helical
RNA terminated with a short loop) were held in such a way that equilibrium fluctuations between open and closed
states occurred. The timescale observed was close to 1 s [51], remarkably long for a few-nm-long molecule. In [60]
a dynamical model was introduced for the motion of the 'fork' separating the base-paired and opened regions of the
molecule, allowing computation of the opening and closing rates as a function of the force. The model, which describes
the experimental data well, is based on the following elementary rates of opening and closing base pair $n$,

$$ r_o(n) = r\, e^{-g_0(n)/k_B T} , \qquad r_c(f) = r\, e^{-2 g_{ss}(f)/k_B T} . \qquad (22) $$

The opening rate is taken to depend only on the pairing free energy since the short-range hydrogen bond is broken
before the force-length work over the longer $\simeq 0.7$ nm distance can be done. Conversely, to close the base pair, work
must first be done against the applied force, and so the closing rate is taken to depend only on the force. The
separation of length scales between the range of the base-pair interaction and the base-pair extension after unzipping is thus used
to justify placing most of the force dependence in the zipping rate, with most of the interaction dependence in the
unzipping rate. More sophisticated rate models will require further experiments to determine their form.
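
A fork model of this kind can be explored as a one-dimensional birth-death process in the number of open base pairs. The sketch below follows the rate structure described above (opening rate set by the pairing free energy only, closing rate set by the force only). The Gaussian expression for the stretching free energy per base, the attempt rate `r`, and the homogeneous test sequence are assumptions made for illustration, and all function names are hypothetical.

```python
import numpy as np

KBT = 4.1      # thermal energy in pN*nm (room temperature, assumed)
B_KUHN = 0.7   # effective ssDNA Kuhn length in nm


def g_ss(force):
    """Stretching free energy gained per unpaired base, in units of kBT.

    Gaussian approximation b^2 f^2 / (6 kBT^2); an assumption of this sketch.
    """
    return (B_KUHN * force) ** 2 / (6.0 * KBT**2)


def fork_rates(g0, force, r=1.0e6):
    """Opening rate of each base pair and the common closing rate, in 1/s.

    g0[i] > 0 is the pairing free energy of base pair i in units of kBT;
    r is a hypothetical microscopic attempt rate.
    """
    g0 = np.asarray(g0, dtype=float)
    r_open = r * np.exp(-g0)
    r_close = r * np.exp(-2.0 * g_ss(force))
    return r_open, r_close


def equilibrium_distribution(g0, force):
    """Probability of n = 0..N open base pairs, from detailed balance."""
    r_open, r_close = fork_rates(g0, force)
    log_w = np.concatenate(([0.0], np.cumsum(np.log(r_open / r_close))))
    w = np.exp(log_w - log_w.max())
    return w / w.sum()


def mean_opening_time(g0, force):
    """Mean first-passage time from fully closed (n = 0) to fully open (n = N).

    Uses tau_n = 1/lambda_n + (mu/lambda_n) * tau_{n-1}, where tau_n is the mean
    time to go from n to n + 1 open pairs, with a reflecting boundary at n = 0.
    """
    r_open, r_close = fork_rates(g0, force)
    tau = 1.0 / r_open[0]
    total = tau
    for lam in r_open[1:]:
        tau = 1.0 / lam + (r_close / lam) * tau
        total += tau
    return total
```

For a homogeneous sequence at the force where opening and closing rates balance, `equilibrium_distribution` is flat and `mean_opening_time` grows quadratically with the number of base pairs, as expected for unbiased diffusion of the fork. Sequence-dependent `g0` values can be substituted to explore barrier-limited hopping.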



V. CONCLUSION 



We have presented a very brief overview of the theory used to think about single-molecule nucleic acid
micromanipulation experiments. The field of single-molecule experiments is evolving so rapidly at present that we have been
forced to omit many exciting topics. Here we present a few general comments about what has been learned and 
suggest some directions that might be particularly interesting for study in the near future. 

A feature common to all the studies described above, and to the theory of other types of single-biomolecule
experiments, is the central role of statistical mechanics. The interaction of this field with statistical mechanics is 
fundamental: the understanding (in some cases, even the primary data analysis) of single-molecule DNA experiments 
requires statistical mechanics. Additionally, previously unimagined statistical-mechanical problems are resulting from 
the huge range of experimental possibilities for DNA and DNA-protein micromechanical experiments. 

The first phase of single-DNA experiments involved basic characterization of dsDNA and ssDNA, and from the 
theoretical side involved development and solution of statistical-mechanical theories for the molecules subjected to 
stresses. The studies reviewed above essentially fall into this first class, and are characterized by a degree of
quantitative success, thanks both to the efforts of experimentalists and theorists, which is unprecedented in soft condensed
matter physics. 

The second phase, which we are in at present, involves the study of modifications of the basic molecules, e.g. by 
unzipping, or by action of proteins acting on DNAs under mechanical control. The statistical mechanics of this second 
class of problems is less well developed, and includes problems far from thermal equilibrium such as DNAs acted on 
by processive, ATP-powered motor-like enzymes. The diversity of challenging and new statistical physics problems 
in this second class is mind-boggling. The second phase is also forcing theorists to confront the information content 
of nucleic acids since sequence plays an essential role in targeted nucleic acid-protein interactions. 

The third class of problems involves applications of the lessons learned, to the description of cell machinery in 
vivo, or at least under in-vivo-like conditions. Such experiments are in their infancy, and extension of theoretical 
physics into this domain is still in a dark age. However, we can look forward to a time when we understand processes 
such as cell division and growth, gene regulation, and other cell biological functions in statistical-mechanistic terms. 
The lessons being learned now about the importance of statistical mechanical ideas to biochemical-micromechanical 
experiments on nucleic acids will thus become an important component of future quantitative understanding of cell 
biology. 